Point estimation
Point estimation is the estimation of the parameters of a statistical model by a single number, a point estimate, for each parameter.

@todo

It is assumed that a set of observations $x_1, x_2, \ldots, x_n$ arises as $n$ independent observations from a known distribution with unknown parameters $\theta$.

A point estimate is the observed value of a point estimator; it is an estimate of the value of an unknown parameter of a statistical model.

Point estimator

A point estimator is a function of the random variables $X_1, X_2, \ldots, X_n$. Evaluating that function at the observed values $x_1, x_2, \ldots, x_n$ gives the point estimate.

Maximum likelihood estimation

Maximum likelihood estimation (MLE) is a method for obtaining a point estimator $\hat{\theta}$ for the unknown parameters $\theta$. When the likelihood is differentiable, the maximum likelihood estimator of an unknown parameter can be obtained using calculus.

We write $f(x|\theta)$ for the probability density function associated with $X$, or for the probability mass function (pmf) if $X$ is discrete.

Likelihood function

@todo

The likelihood function is referred to as the likelihood for short. It gives a measure of how likely a value of $\theta$ is, given that the sample $X_1, X_2, \ldots, X_n$ is known to have the values $x_1, x_2, \ldots, x_n$:

$$L(\theta) = f(x_1|\theta) \times f(x_2|\theta) \times \cdots \times f(x_n|\theta) = \prod_{i=1}^n f(x_i|\theta)$$

$L(\theta)$ is a function of $\theta$ over the parameter space, not of the $x_i$, which are fixed at their observed values. The best estimator of $\theta$ is the one that maximises the likelihood function.

If the likelihood is differentiable then, as for any differentiable function, its maxima can be found by calculus. The stationary points are the solutions of

$${d \over d\theta} L(\theta) = L'(\theta) = 0$$

To determine which of these points are maxima, which are minima, and which are saddle points, take the second derivative

$${d^2 \over d\theta^2} L(\theta) = L''(\theta)$$

and evaluate it at $\theta = \hat{\theta}$. If $L''(\hat{\theta}) < 0$ then $\hat{\theta}$ corresponds to a maximum.

Log-likelihood function

The log-likelihood is defined as $l(\theta) = \log L(\theta)$, with log to base $e$. The maximum likelihood estimator, which is the maximiser of $L(\theta)$, is also the maximiser of the log-likelihood $l(\theta)$. It is usually easier to work with the log-likelihood, because the logarithm converts the product of terms into a sum, and sums are easier to differentiate.

Maximum log-likelihood estimation

Because $l(\theta) = \log L(\theta)$ is maximised at the same point as the likelihood, the calculus steps above can be applied to $l(\theta)$ instead of $L(\theta)$. If the likelihood function is not differentiable over its range, then the MLE cannot be found using calculus.

Proof that the likelihood and the log-likelihood have the same maximiser

The logarithm is a strictly increasing function, so for any $\theta_1, \theta_2$ with positive likelihood, $L(\theta_1) > L(\theta_2)$ if and only if $\log L(\theta_1) > \log L(\theta_2)$. Hence

$$\text{maximiser of } L(\theta) = \text{maximiser of } \log L(\theta) = \text{maximiser of } l(\theta)$$

where

$$l(\theta) = \log L(\theta) = \log \prod_{i=1}^n f(x_i|\theta) = \sum_{i=1}^n \log f(x_i|\theta)$$

Estimating Normal parameters

@todo

The Cramér–Rao lower bound

The Cramér–Rao lower bound, often abbreviated to CRLB, is a lower bound on the variance that can be achieved by any unbiased estimator.

Notes

It is also possible that the log-likelihood function is not differentiable everywhere in the interior of the parameter space, in which case the calculus route to maximum likelihood estimation fails.
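
As an illustration of the calculus route to the MLE described above, here is a minimal sketch for one particular model. The choice of an exponential distribution with density $f(x|\theta) = \theta e^{-\theta x}$, the data values, and the function names are all assumptions made for this example and are not part of the article. For this model $l(\theta) = n \log \theta - \theta \sum x_i$, so $l'(\theta) = 0$ gives $\hat{\theta} = n / \sum x_i$, and $l''(\theta) = -n/\theta^2 < 0$ confirms a maximum.

```python
import math

def log_likelihood(rate, xs):
    """Log-likelihood of an assumed Exponential(rate) model: sum of log f(x_i | rate)."""
    return sum(math.log(rate) - rate * x for x in xs)

def mle_rate(xs):
    """Closed-form maximiser obtained by solving l'(rate) = n/rate - sum(xs) = 0."""
    return len(xs) / sum(xs)

# Hypothetical observations, chosen only for illustration.
xs = [0.5, 1.2, 0.3, 2.0, 1.0]
rate_hat = mle_rate(xs)

# Second derivative l''(rate) = -n / rate**2, evaluated at the estimate.
# A negative value indicates that the stationary point is a maximum.
second_derivative = -len(xs) / rate_hat ** 2

# Numerical sanity check: search a grid of candidate rates for the largest log-likelihood.
grid = [0.01 * k for k in range(1, 1001)]
best_on_grid = max(grid, key=lambda r: log_likelihood(r, xs))

print(rate_hat, second_derivative, best_on_grid)
```

The grid search is only a cross-check on the closed-form answer; it should land on the grid point nearest to `rate_hat`.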
Properties of maximum likelihood estimators

Unbiasedness

Minimum variance unbiased estimator
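
An estimator is unbiased if its expected value equals the parameter it estimates. The following sketch checks this by simulation for one case: the MLE of the variance of a normal distribution, which divides the sum of squared deviations by $n$ and has expectation $\frac{n-1}{n}\sigma^2$. The model, the sample size, the number of repetitions, and the function name `mle_variance` are assumptions made for this illustration only.

```python
import random

def mle_variance(xs):
    """MLE of a normal variance: mean squared deviation from the sample mean (divides by n)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n

random.seed(0)  # fixed seed so the sketch is repeatable
n, reps, true_variance = 5, 20000, 1.0  # hypothetical settings

# Draw many samples of size n from N(0, 1) and record the MLE for each one.
estimates = [
    mle_variance([random.gauss(0.0, 1.0) for _ in range(n)])
    for _ in range(reps)
]

# The average of the estimates approximates the expected value of the estimator.
average_mle = sum(estimates) / reps
# Rescaling by n / (n - 1) gives the usual unbiased sample variance.
average_corrected = average_mle * n / (n - 1)

print(average_mle, average_corrected)
```

With these settings the average MLE should come out near $(n-1)/n = 0.8$ rather than the true variance of 1, while the rescaled average should be close to 1. This shows that a maximum likelihood estimator need not be unbiased.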